Algorithms for managing jobstreams in a complex computer
environment often rely on various estimates of job run times.
Due to wide variability of run times from one execution of a job
to another, point estimations of run times are fairly unreliable.

An alternate approach to using point estimations is to use intervals
which span the range of possible run time values. In an interval
approach, run times can be predicted with respect to membership in
one of a limited set of run time intervals, with relatively high
confidence. This paper presents a formal methodology for run time
estimation based on an interval approach. The estimation is done
using signature table analysis and is accompanied by a statement of
statistical confidence in the results.

Key  words:  Interval  estimation;  point  estimation;  run  time  prediction; 
signature  table  analysis. 
1. Introduction

Algorithms for managing jobstreams in a complex computer
environment often rely on various estimates of job run times.

Typical run times of interest include response time, processing
time, turnaround time and so on. For example, scheduling algorithms
which tend to minimize average job turnaround time based on the
shortest-processing-time principle often rely on a prediction of
what the job processing time will be. In systems which have a large
degree of multiprogramming, run times for a particular job vary
widely from one execution to another, depending upon the number and
kinds of jobs that are simultaneously contending for resources.
Prediction of run times, therefore, although fairly accurate "on the
average," tends to be unreliable in any single instance because of
the inherent complexity of the processing environment.

An alternate approach to using point estimations of run times, with
their inevitably large variability and low confidence, is to use
intervals which span the range of possible run time values. In an
interval approach, run times are predicted with respect to projected
membership in one of a limited set of run time intervals. The
potential advantages of this technique are that in some environments
prediction can be done based on very little knowledge about a job,
and the confidence of predicting membership in the correct interval
can be very high. The usefulness of this interval approach has long
been recognized in the computer community and several ad hoc
implementations exist. The classification of jobs in IBM's job
preprocessor called HASP, for example, has been achieved in some
installations by placing jobs in classes A, B, C and so on, based on
user supplied estimates of resource requirements. Essentially, these
classes represent predicted run time intervals for their respective
members.

This research was supported by the U.S. Army Research Office under
grant number CAAG29-

This paper presents a formal methodology for run time estimation
based on an interval approach. The estimation is done using
signature table analysis and is accompanied by a statement of
statistical confidence in the results. It may be true that for very
complex systems, subjective (or even random) estimation is the best
method. This paper discusses the improvement possible on subjective
"guesstimates."

2. General Background: Signature Table Analysis

Run time estimation for single computer systems is an important
performance question which can be formulated in the following way:
given a specified computer hardware and software configuration, and
a workload which is composed of a series of jobs to be run on that
system, what characteristics of the jobs can best be used to predict
their respective run times? More specifically, let a computer system
workload, U, consist of a series of n jobs, P_i, i = 1, ..., n, and
assume there exists a set of m descriptors d_1, d_2, ..., d_m for
each P_i which characterize that job's behavior. Then the question
of interest is which subset(s) of these descriptors can best be used
to predict an additional or "key" descriptor, namely run time, and
what is the particular function of the critical descriptors which
yields this prediction.

The nature of the run time prediction problem and the motivation for
developing certain kinds of methodologies for its solution can be
illustrated by placing the problem in the context of a large,
production-oriented computer system. In this case, a certain number
of production jobs are being run on a regular basis (daily, weekly,
monthly, and so on). These production jobs often consist of several
different programs (for example, payroll runs which include not only
the relevant salary calculations, but also check-writing routines
and summary report routines), and require a variety of system
resources. Further, due to security and deadline constraints, they
are often run on a dedicated system. The production jobs are
completely specified and their characteristics with respect to
development, maintenance and run-time behavior, d_1, d_2, ..., d_m,
can be determined in most instances. Now if a new production job is
proposed for implementation on the existing system, the
specification of its required turnaround time becomes a critical
factor upon which to base the decision to allow or disallow it.

Some subset of the projected behavior characteristics of the
proposed job may be known, and resources may be available to
investigate others, in order to estimate the job's turnaround time.
The questions of which characteristics are most important in
prediction, what form the predictor should take, and with what
confidence the prediction can be made, must then be addressed.

The computer run time prediction problem can be formulated in
terminology that makes the application of a pattern recognition
technique called signature table analysis appear extremely
appropriate. In essence, this technique deals with manipulating a
set of data which possesses a finite number of discrete features, as
well as a "key" feature. Analyses are performed on a "training
sample" for which values of all the features, including the key
feature, are known. Prediction of the key feature is explored by
means of the specification of a derived (combined) feature set which
approximates the key feature on the training data. The derived
feature set can then be applied to other sets of data for which the
key feature must be predicted.

Typically, in a run time prediction environment, a training sample
or set of data is collected which consists of a finite number of
workload characteristics, like CPU, I/O and core resource
requirements, along with the known turnaround time of already
existing production jobs. Turnaround time prediction may then be
conceptualized as the problem of identifying the significant
"features" among the d_i which best describe a job's turnaround time
"pattern".

The signature table method of pattern recognition, suggested by
Samuel (SAM67) for use in machine learning problems and further
developed by Page (PAG75), is a hierarchical approach for the
recognition of patterns which are described in terms of many
features. The method provides a means by which features are
exhaustively analyzed in subsets, each of which provides a derived
feature. The derived features are combined to result in higher
derived features which depend in a nonlinear manner on all of the
original features. An example of the tabular structure which may
result from applying the method to four features is shown in Figure
1. (Figure 1 is discussed in more detail below.)

The major advantages of the signature table method over other
prediction techniques, and those that render it especially
applicable to the run time prediction problem, are:

1) the quality of prediction is improved as more independent
features or descriptors are used (this is in contrast to some
techniques possessing the counter-intuitive property that for a
finite-sized training sample there is an optimal number of
features),

2) it provides a natural way to deal with missing data,

3) it allows the analyst to introduce personal knowledge and
intuition about the system into the calculation process (this
capability may greatly reduce the amount of computation required; it
is comparable to the analyst's capability in the design of
fractional factorial experiments to indicate which variable
interactions are important and which are not),


Figure 1. Signature Tables for One Combination of Four Binary Features
4) in many cases it provides better prediction than multiple
regression, at less cost; this is true in part due to the use of an
interval estimate approach rather than a point estimate approach as
previously discussed, and

5) it is applicable to data in all formats: numeric, symbolic,
ordinal and graphical.

The heuristics developed by Page to implement this technique have
been specified for binary (i.e., two-valued) feature values, and
therefore for the recognition of binary patterns. Essentially, the
methodology requires the following steps. First,

1. Determine the appropriate predictor features.

2. Determine the appropriate cutpoint of each feature, using
measures of minimum entropy (information loss) and maximum
prediction. Cutpoints are needed to discretize continuous features
and to manipulate the allowed number of discrete values of each
feature.

Then iteratively, at each (derived) feature level, until a single
derived feature is obtained,

3. Determine feature subsets or "signature types" upon which derived
features are to be based. In Figure 1, for example, features d_1 and
d_2 are combined to derive feature D_12, and features d_3 and d_4
are combined to derive feature D_34. But various other combinations
are possible.

4. Define the derived feature resulting from the respective combined
features, using an appropriate quantization method. This method is
symbolized by the f_i's in Figure 1.

Finally,

5. Extract the relationship between the original features and the
derived feature as Boolean expressions which describe, with some
known probability, relations inherent in the data. This process is
illustrated in Figure 1 for the derived feature D_1234. The
potential usefulness of such an expression in run time prediction
becomes apparent when one observes that only any three of the four
feature values need be obtained to calculate the D_1234 value.
Hence, the signature table method is a way of using the data to
evolve switching functions which discriminate between members of
various classes (or binary values of the key feature in Page's
work).
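The hierarchy of signature tables described above can be sketched in
a few lines of Python. This is only an illustration of the structure
(two first-level tables feeding one second-level table); the table
contents AND_TABLE and OR_TABLE and the function name
evaluate_hierarchy are hypothetical and are not taken from Figure 1.

```python
from typing import Dict, Tuple

# A signature table maps a pair of binary feature values to a derived value.
Table = Dict[Tuple[int, int], int]

# Hypothetical table contents, chosen only for illustration.
AND_TABLE: Table = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
OR_TABLE: Table = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}


def evaluate_hierarchy(d1: int, d2: int, d3: int, d4: int,
                       f12: Table, f34: Table, f1234: Table) -> int:
    """Evaluate a two-level signature table hierarchy on four binary features."""
    d12 = f12[(d1, d2)]        # first-level derived feature D_12
    d34 = f34[(d3, d4)]        # first-level derived feature D_34
    return f1234[(d12, d34)]   # second-level derived feature D_1234


if __name__ == "__main__":
    print(evaluate_hierarchy(1, 1, 0, 1, AND_TABLE, OR_TABLE, AND_TABLE))
```

With these illustrative tables the top-level feature is equivalent to
the switching function (d_1 ∧ d_2) ∧ (d_3 ∨ d_4); other table
contents yield other functions.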


3.  A Sample  Application 


The signature table analysis approach was used to solve a run time
prediction problem for the U.S. Army. A description of the
application of the method to the Army data will serve to demonstrate
its basic simplicity and its effectiveness in achieving the
objective of interval estimation of run time with a relatively high
degree of confidence.


3.1. Division of the Key Feature into Intervals


The key feature, turnaround time, was divided into three intervals:
less than 10 minutes, 10-20 minutes and more than 20 minutes. These
intervals appeared to be natural divisions in the data and were not
chosen based on any statistical considerations of appropriateness.
Also, they seemed to be reasonable intervals for use in the decision
making process which would follow turnaround time prediction,
namely, whether or not to allow the proposed job to be developed and
supported.


The three intervals were defined by two cutpoints, 600 seconds (10
minutes) and 1200 seconds (20 minutes). For experimental purposes
each of these cutpoints was investigated in a separate stage. First,
Boolean functions were derived to estimate if a job's run time would
be less than or greater than 600 seconds. Then another set of
functions was derived to estimate if a job's run time would be less
than or greater than 1200 seconds. The functions which predicted
with the highest accuracy from each set, based on the training
sample data, were then combined to derive a single function. As will
be described later in the analysis of the results, this single
function was used to predict into which of the three intervals a
job's turnaround time fell.


3.2. Cutpoint Specification for Predictor Features


The purpose of the Army study was to develop a predictor for the
total turnaround (TA) time of a proposed application (production)
job, based on a set of projected job resource requirements. Data for
the development of the predictor consisted of 412 observations on
currently running production jobs. A single observation was provided
in the form of a 5-tuple, V = (CPU time, turnaround time, punch I/O,
tape I/O, disk I/O). The data were divided into a training sample of
284 observations and a test sample of 128 observations.


The experiment was broken up into 4 steps: 1) division of the key
feature (turnaround time) into intervals, 2) cutpoint specification
for predictor features, 3) computation of derived features and 4)
analysis of the results. These steps are discussed in turn below.


Each of the predictor features was discretized into two ranges for
each of the two experimental stages, a "low" and a "high" range. All
of the predictor features were positively correlated with the key
feature in that a low predictor feature value "predicted" a low
turnaround time and a high predictor feature value "predicted" a
high turnaround time. Given a particular key feature cutpoint, the
predictor feature cutpoints were chosen so as to minimize the total
number of incorrect key feature predictions. Cutpoints were
determined using the Statistical Package for the Social Sciences
(SPSS) (NIE75). Basically, frequency tables of the form shown in
Figure 2 were computed for different possible predictor feature
cutpoints. The value for which (b+c) was minimized was selected as
the cutpoint value. Table 1 contains the cutpoints which were
computed for the two stages of the experiment. Also tabulated are
the number and percentage of the 284 training sample observations
which were incorrectly predicted using each cutpoint. Note that even
the best cutpoint value in certain cases resulted in a large
percentage of incorrect key feature predictions. This is due to a
predictor feature's inability to single-handedly forecast the
pattern of job turnaround time.
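The cutpoint search described above can be sketched as follows. This
is a minimal illustration, not the SPSS procedure used in the study:
it assumes that b and c are the two disagreeing cells of the
frequency table (predicted low but actually high, and predicted high
but actually low), that a value at or above the cutpoint counts as
"high", and that only observed feature values are tried as
candidates. The function name best_cutpoint is hypothetical.

```python
from typing import Sequence, Tuple


def best_cutpoint(values: Sequence[float],
                  key_high: Sequence[bool]) -> Tuple[float, int]:
    """Return (cutpoint, errors) for the candidate minimizing b + c.

    values   -- predictor feature value of each training job
    key_high -- True if that job's key feature (TA time) is in the high range
    """
    if not values or len(values) != len(key_high):
        raise ValueError("values and key_high must be non-empty and equal in length")
    best_cut, best_errors = None, None
    for cut in sorted(set(values)):
        # A job is predicted 'high' when its feature value is >= cut;
        # every disagreement with the key feature lands in cell b or c.
        errors = sum((v >= cut) != high for v, high in zip(values, key_high))
        if best_errors is None or errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut, best_errors


if __name__ == "__main__":
    cpu = [10.0, 30.0, 50.0, 70.0, 90.0, 200.0]
    slow = [False, False, False, True, True, True]
    print(best_cutpoint(cpu, slow))
```

Ties are resolved in favor of the smallest candidate cutpoint, which
is an arbitrary choice made for this sketch.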


3.3. Computation of Derived Features


The computation of derived features has been described and analyzed
in (PAG75). A description of the steps followed in this study will
be provided here. In general, predictor features are combined to
produce second level derived features. These in turn are combined to
produce higher level features. The process terminates when enough of
the original predictor features have been used to produce higher
level features which can predict the key feature's interval value
with a high degree of accuracy.


Figure 2. Derivation of Predictor Feature Cutpoints (frequency table
of low/high predictor feature range against low/high key feature
range)
Table 1. Predictor Feature Cutpoint Values

                   TA Time Cutpoint: 600 Seconds        TA Time Cutpoint: 1200 Seconds
                   Feature    Number of    %            Feature    Number of    %
                   Cutpoint   Incorrect    Incorrectly  Cutpoint   Incorrect    Incorrectly
Feature                       Predictions  Predicted               Predictions  Predicted

V1  CPU Time         60.0        23          8.1%         160.0       44          15.5%
V3  Punch I/O         1.0       123         43.3%          20.0      108          38.0%
V4  Tape I/O        400.0        53         18.7%        1110.0       57          20.1%
V5  Disk I/O          1.0        28          9.9%        1600.0      128          45.1%

Predictor features were combined in pairs. Since each feature had
been divided into a low and high range by a feature cutpoint, there
were four possible combinations: low:low, low:high, high:low and
high:high. For purposes of computational ease, low was represented
by 0 and high by 1. Once again using SPSS, frequency tables were
computed to determine how many high and low key feature values
existed in the training sample for each combination. For each
combination the proportion of high values of the key feature,
p_high, was compared to the proportion of high values of the key
feature in the entire training sample. If the first proportion was
larger, then it was judged that that combination predicted a high
key feature value; otherwise, a low key feature value was predicted.
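The pairwise rule just described can be illustrated with a short
Python sketch. It assumes each training observation has already been
reduced to three binary values (two predictor features and the key
feature), and it treats a combination that never occurs in the
sample as predicting low; that default, like the name derive_feature,
is an assumption of this sketch rather than something stated in the
study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Row = Tuple[int, int, int]  # (feature_a, feature_b, key); 0 = low, 1 = high


def derive_feature(rows: Iterable[Row]) -> Dict[Tuple[int, int], int]:
    """Build the derived-feature table for one pair of binary features."""
    rows = list(rows)
    if not rows:
        raise ValueError("training sample is empty")
    # Proportion of high key feature values in the entire training sample.
    overall = sum(key for _, _, key in rows) / len(rows)
    counts = defaultdict(lambda: [0, 0])  # combination -> [n_rows, n_high_key]
    for a, b, key in rows:
        counts[(a, b)][0] += 1
        counts[(a, b)][1] += key
    table = {}
    for combo in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        n, n_high = counts[combo]
        # Predict high only if this combination's p_high exceeds the
        # overall proportion; unseen combinations default to low (assumption).
        table[combo] = int(n > 0 and n_high / n > overall)
    return table


if __name__ == "__main__":
    sample = [(1, 1, 1)] * 4 + [(1, 0, 0)] * 2 + [(0, 1, 0)] * 2 + [(0, 0, 0)] * 2
    print(derive_feature(sample))
```

For the small sample in the usage example the resulting table is 1
only for the high:high combination, so the derived feature is the
Boolean AND of the two inputs.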

Two examples of derived features, one for the turnaround time
cutpoint of 600 seconds and one for the turnaround time cutpoint of
1200 seconds, are provided in Table 2. It can be seen that a derived
feature can be expressed as a Boolean function or combination of the
two features from which it was derived. In Table 2a, both Tape I/O
and Disk I/O had to be 1 (high) for the derived feature to be 1.
Consequently, the derived feature is equivalent to the Boolean
expression Tape I/O ∧ Disk I/O, or more conveniently, V4 ∧ V5.
Likewise, the Boolean expression derived in Table 2b is CPU time, or
simply V1.


The process for combining features was then repeated, this time
combining the derived features. Eventually, several final Boolean
expressions for both turnaround time cutpoints were determined, all
of which were derived from at least three of the four original
predictor features.

3.4.  Analysis  of  Results 

The final Boolean expressions derived for each turnaround time
cutpoint are presented in Table 3. For each expression, the number
and percentage of correct and incorrect predictions have been
tabulated.


Based on the accuracy of prediction for the training sample, it was
concluded that the variable V1 (CPU time) was the best predictor of
turnaround time for both cutpoints. It should be remembered that the
variable V1 used to predict below-above 10 minutes is slightly
different from the variable V1 used to predict below-above 20
minutes inasmuch as different predictor feature cutpoints were
calculated for each. The labels V1_600 and V1_1200 are employed
below to differentiate between the two.

The variables V1_600 and V1_1200 were combined to derive a predictor
of all three turnaround time intervals. This predictor is a set of
Boolean expressions based on four variable values:

1. V1_600 = CPU time ≥ 60 seconds

2. ¬V1_600 = CPU time < 60 seconds

3. V1_1200 = CPU time ≥ 160 seconds

4. ¬V1_1200 = CPU time < 160 seconds

These variables were combined to form the turnaround time
predictions:

¬V1_600 ∧ ¬V1_1200 = TA less than 10 minutes

V1_600 ∧ ¬V1_1200 = TA between 10 and 20 minutes

V1_600 ∧ V1_1200 = TA greater than 20 minutes

¬V1_600 ∧ V1_1200 = no prediction

The last combination is contradictory since ¬V1_600 implies CPU time
less than 60 seconds and V1_1200 implies CPU time greater than or
equal to 160 seconds. This combination was defined to be an
automatic incorrect prediction. (As it turned out, none of the test
or training sample data had this combination, an indication of the
consistency of the separately derived expressions.)
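The combined predictor is small enough to write out directly. The
following Python sketch uses the CPU time cutpoints of 60 and 160
seconds given above; the function names combine and predict_interval
and the returned label strings are hypothetical, introduced only for
illustration.

```python
def combine(v1_600: bool, v1_1200: bool) -> str:
    """Map the two binary CPU-time variables to a turnaround time interval."""
    if not v1_600 and not v1_1200:
        return "less than 10 minutes"
    if v1_600 and not v1_1200:
        return "between 10 and 20 minutes"
    if v1_600 and v1_1200:
        return "greater than 20 minutes"
    # Remaining case: NOT V1_600 AND V1_1200, the contradictory combination.
    return "no prediction"


def predict_interval(cpu_seconds: float,
                     cut_600: float = 60.0,
                     cut_1200: float = 160.0) -> str:
    """Predict the turnaround time interval of a job from its CPU time."""
    return combine(cpu_seconds >= cut_600, cpu_seconds >= cut_1200)


if __name__ == "__main__":
    for cpu in (30.0, 75.0, 400.0):
        print(cpu, "->", predict_interval(cpu))
```

Because both variables are thresholds on the same CPU time value and
60 is below 160, predict_interval can never reach the "no prediction"
branch with the default cutpoints; the branch is kept in combine to
mirror the four-way rule.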


Finally, the accuracy of the predictor was estimated using the set
of test data. Since turnaround times were available for the test
data, it was possible to get an estimate of P_accuracy, the
proportion of accurate predictions using the predictor. A summary of
the actual turnaround times versus the predicted turnaround times is
presented in Table 4. The left-to-right diagonal cells represent
correct predictions since the predicted interval is the same as the
actual interval. Other cells represent incorrect predictions.

Of the 128 test values, 101 observations were correctly predicted,
thereby providing an estimate of the overall predictor accuracy,
P_accuracy, of .789. An approximate 95% confidence interval for
P_accuracy was calculated, using a normal approximation, to be
(.718, .860). In almost all cases (98.4% of the time), the
prediction was either correct or within one interval. That is,
seldom did the predictor predict less than 10 minutes when the
actual turnaround time was greater than 20 minutes, and vice versa.
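The confidence interval quoted above can be checked with the usual
normal approximation for a proportion, p ± z·sqrt(p(1 − p)/n). The
sketch below assumes z = 1.96 for an approximate 95% interval; the
function name accuracy_interval is hypothetical.

```python
import math
from typing import Tuple


def accuracy_interval(correct: int, total: int,
                      z: float = 1.96) -> Tuple[float, float, float]:
    """Return (p, lower, upper) using the normal approximation."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, p - half_width, p + half_width


if __name__ == "__main__":
    p, lower, upper = accuracy_interval(101, 128)
    print(f"{p:.3f} ({lower:.3f}, {upper:.3f})")
```

For 101 correct predictions out of 128 this gives an estimate of
about .789 with bounds of about .718 and .860.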


Table 3. Accuracy of Final Boolean Expressions on Training Sample

TA Time Cutpoint    Boolean Expression

10 minutes          V1
10 minutes          V1 ∧ V4 ∧ V5
20 minutes          V1
20 minutes          V4

(The counts and percentages of correct and incorrect low and high
predictions are not legible in this copy.)

Table 4. Estimated Accuracy of the Predictor

                              Actual TA Time (seconds)
Predicted TA
Time (seconds)            0-600    600-1200    1200+    Row Total

0-600             a         13         2          1         16
                  b        81.3      12.5        6.3
                  c        86.7       4.7        1.4

600-1200          a          1        34         15         50
                  b         2.0      68.0       30.0
                  c         6.7      79.1       21.4

1200+             a          1         7         54         62
                  b         1.6      11.3       87.1
                  c         6.7      16.3       77.1

Column Total                15        43         70        128

Legend (cell for predicted interval J and actual interval I):

a: No. of TA values predicted to fall into interval J which had
   actual TA values in interval I

b: % of all interval J predictions which fell into interval I

c: % of actual interval I values which were predicted to be in
   interval J

4.  Conclusions 


Due to the large variability in job run times from one execution to
another, point estimates of run times are unreliable. Interval
estimation of run times is a reasonable approach to obtaining run
time predictions in which a higher confidence can be placed. The
application of signature table analysis to the prediction of
turnaround time in one particular environment has yielded a
predictor that was simple to develop, is simple to use, and is
accurate about 80% of the time in predicting membership in one of
three turnaround time classes. Although this interval approach
technique will not provide sufficient predictive power for all
applications, it is appropriate for some application objectives and
should be considered as a desirable alternative to less
statistically sound approaches.